Abstract. We extend algorithmic information theory to quantum mechanics, taking a universal
semicomputable density matrix ("universal probability") as a starting point, and define complexity
(an operator) as its negative logarithm.

A number of properties of Kolmogorov complexity extend naturally to the new domain. Approximately,
a quantum state is simple if it is within a small distance from a low-dimensional
subspace of low Kolmogorov complexity. The von Neumann entropy of a computable density matrix
is within an additive constant from the average complexity. Some of the theory of randomness
translates to the new domain.

We explore the relations of the new quantity to the quantum Kolmogorov complexity defined 
by Vitányi (we show that the latter is sometimes as large as $2n - 2\log n$) and the qubit complexity
defined by Berthiaume, Dam and Laplante. The "cloning" properties of our complexity measure
are similar to those of qubit complexity.

1. Introduction



Kolmogorov complexity (or by a more neutral name, description complexity) is an attractive 
concept, helping to shed light onto such subtle concepts as information content, randomness and 
inductive inference. Quantum information theory, a subject with its own conceptual difficulties,
is currently attracting more attention than ever before, due to the excitement around quantum
computing, quantum cryptography, and the many connections between these areas. The new interest
is also spurring efforts to extend the theory of description complexity to the quantum setting:
see [Vitányi] and [Berthiaume, Dam and Laplante]. We continue these efforts in the hope that the
correct notions will be found at the convergence of approaches from different directions. This has
been the case for the theory of classical description complexity and randomness. What we expect
from this research is an eventual deeper understanding of quantum information theory itself.

One of the starting points from which it is possible to arrive at description complexity is Levin's
concept of a universal semicomputable (semi)measure. We follow this approach in the quantum 
setting, where probability measures are generalized into density matrices. 

In contrast to the works [Vitányi] and [Berthiaume, Dam and Laplante], we do not find the notion of
a quantum computer essential for this theory, nor even for the notions and results found in these
works. The reason is that limitations
on computing time do not play a role in the main theory of description complexity, and given 
enough time, a quantum computer can be simulated by a classical computer to any desired degree 
of precision. 

1.1. Notation. It seems that universal probability can also be defined in an infinite-dimensional 
space (it should be simple to extend the notions to Fock space), but we will confine ourselves to 
finite-dimensional spaces, in order to avoid issues of convergence and spectral representation for 
infinite-dimensional operators. Let us fix for each $N$ a finite-dimensional Hilbert space $\mathcal{H}_N$, with a
canonical orthonormal basis $|\beta_1\rangle, \dots, |\beta_N\rangle$. (We do not use a double index here, since we can assume
that $\mathcal{H}_N \subset \mathcal{H}_{N+1}$ and the canonical basis of $\mathcal{H}_N$ is also the beginning of that of $\mathcal{H}_{N+1}$.) Let
$Q_n = \bigotimes_{i=1}^{n} Q_1$ be the Hilbert space of $n$ qubits. Let $|0\rangle, |1\rangle$ be some fixed orthonormal basis of $Q_1$.
Let $Z_2^n$ be the set of binary sequences of length $n$. If $x \in Z_2^n$ then $x = (x(1), x(2), \dots, x(n))$, and we
write

$$l(x) = n.$$

We denote, as usual, for $x \in Z_2^n$:

$$|x\rangle = \bigotimes_{i=1}^{n} |x(i)\rangle.$$

We identify $Q_n$ with $\mathcal{H}_{2^n}$, with the canonical basis element $|\beta_x\rangle = |x\rangle$.

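If we write $\psi$ or $|\psi\rangle$ for a state then the corresponding element of the dual space will be written
either as $\psi^{*}$ or as $\langle\psi|$. Accordingly, the inner product can be written in three ways as

$$\langle\phi|\psi\rangle = \langle\phi| \cdot |\psi\rangle = \phi^{*}\psi.$$

As usual, we will sometimes write

$$|\phi\rangle \otimes |\psi\rangle = |\phi\rangle|\psi\rangle = |\phi, \psi\rangle.$$

The operation $\operatorname{Tr}$ denotes trace, and over a tensor product space $\mathcal{H}_X \otimes \mathcal{H}_Y$, the operation $\operatorname{Tr}_Y$
denotes partial trace.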

As usual, for self-adjoint operators $\rho, \sigma$, let us write $\rho \le \sigma$ if $\sigma - \rho$ is nonnegative definite.

Let us call a quantum state $|\psi\rangle$, with coefficients $\langle\beta_i|\psi\rangle$ that are algebraic numbers, elementary.
The reason for going to coefficients that are algebraic numbers is that this allows us the usual 
operations of linear algebra (orthogonalization, finding eigenvalues and eigenvectors) while remaining 
in the realm of elementary objects. 

Whenever we write $U(p) = |\phi\rangle$ for a Turing machine $U$, we mean that $U$ simply outputs the
(algebraic definitions of the) coefficients of the elementary state $|\phi\rangle$. Similarly, let us call a
self-adjoint operator $T$ elementary if it is given by a matrix with algebraic entries.

We will also write $U(p) = |\phi\rangle$ if $U(p)$ outputs a sequence of tuples $(c_{1k}, \dots, c_{Nk})$ for $k = 1, 2, \dots$,
where $c_{ik}$ is an elementary approximation of $\langle\beta_i|\phi\rangle$ to within $2^{-k}$. In this case, we say that $|\phi\rangle$ is
a computable quantum state with program $p$. We can talk similarly about a program computing a
linear operator on the finite-dimensional space, or even computing an infinite sequence $|\phi_1\rangle, |\phi_2\rangle, \dots$
of states, in which case we output progressively better approximations to more and more elements
of the sequence.
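The notion of a computable state can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the state is $(|0\rangle + |1\rangle)/\sqrt{2}$, and the helper names `sqrt_half_approx` and `approximations` are hypothetical, not part of the paper. It produces rational tuples $(c_{1k}, c_{2k})$ that approximate the two coefficients to within $2^{-k}$.

```python
from fractions import Fraction
from math import isqrt


def sqrt_half_approx(k: int) -> Fraction:
    """Rational c with 0 <= 1/sqrt(2) - c <= 2**-k (hypothetical helper)."""
    # 1/sqrt(2) equals sqrt(2)/2. isqrt returns floor(sqrt(2) * 2**k), so
    # isqrt(2 * 4**k) / 2**k lies within 2**-k below sqrt(2); halving it
    # leaves an error of at most 2**-(k+1).
    scale = 1 << k
    return Fraction(isqrt(2 * scale * scale), 2 * scale)


def approximations(k_max: int):
    """Yield the tuples (c_1k, c_2k) for k = 1, ..., k_max."""
    for k in range(1, k_max + 1):
        c = sqrt_half_approx(k)
        yield (c, c)
```

Rational numbers are used here instead of general algebraic numbers only to keep the sketch short; the definition in the text allows any elementary approximation.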

Let $\stackrel{+}{<}$ denote inequality to within an additive constant, and $\stackrel{*}{<}$ inequality to within a multiplicative
constant. The relations $\stackrel{+}{=}$ and $\stackrel{*}{=}$ are understood analogously.

We assume that the reader knows the definition and simple properties of Kolmogorov complexity,
including the definition of its prefix-free version $K(x)$. For a reference, use [?].

1.2. Attempts to define a quantum Kolmogorov complexity. In [Vitányi], a notion of the description
complexity of a quantum state was introduced. Though that definition uses quantum Turing machines,
this does not seem essential. Indeed, a quantum Turing machine can simulate a classical
one. And if there is no restriction on computing time then any state output by a quantum Turing
machine starting from $|0 \dots 0\rangle$ can also be output with arbitrary approximation by some ordinary
Turing machine. We reproduce the definition from [Vitányi] as follows. For $|\psi\rangle \in \mathcal{H}_N$, let

$$\mathit{Kq}(|\psi\rangle \mid N) = \min\{\, l(p) - \log |\langle\phi|\psi\rangle|^2 : U(p, N) = |\phi\rangle \,\}.$$

So, the complexity of $|\psi\rangle$ is made up of the length of a program describing an approximation $|\phi\rangle$ to
$|\psi\rangle$ and a term penalizing for bad approximation. It is proved in [Vitányi] that for $|\psi\rangle \in Q_n$,

$$\mathit{Kq}(|\psi\rangle \mid n) \stackrel{+}{<} 2n.$$
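One way to see where a bound of the form $2n$ can come from is to approximate the state by a single basis state: its index costs about $n$ bits, and some basis state has squared overlap at least $2^{-n}$, so the penalty term is at most $n$. The sketch below only illustrates this arithmetic; the function names are hypothetical, logarithms are taken base 2, and the additive constant of the machine $U$ is ignored.

```python
from math import log2


def kq_cost(program_length: int, fidelity: float) -> float:
    """The quantity l(p) - log |<phi|psi>|^2 minimized in the definition of Kq."""
    return program_length - log2(fidelity)


def basis_state_bound(amplitudes: list[complex]) -> float:
    """Cost of approximating |psi> by its most likely basis state |x>.

    Assumption for illustration: |x> is charged n bits (its index),
    ignoring the additive constant of the machine U.
    """
    n = (len(amplitudes) - 1).bit_length()       # number of qubits
    best = max(abs(a) ** 2 for a in amplitudes)  # largest |<x|psi>|^2, at least 2**-n
    return kq_cost(n, best)


# Uniform superposition over 3 qubits: every basis state has squared overlap 1/8.
uniform = [8 ** -0.5] * 8
```

For the uniform superposition the cost comes out at about $3 + 3 = 6 = 2n$, while for a basis state itself the penalty term vanishes and only the $n$ bits of the index remain.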

The lower bounds given in that paper are close to $n$. The following theorem will be proved in
Section 0.

Theorem 1. For large enough $n$, there are states $|\psi\rangle \in Q_n$ with $\mathit{Kq}(|\psi\rangle \mid n) > 2n - 2\log n$.

An entirely different approach to quantum Kolmogorov complexity is used in [Berthiaume, Dam and
Laplante], where even the defining programs consist of qubits rather than ordinary bits. I will refer
informally to the complexity defined there as "qubit complexity". Despite the difference in some of
the goals and basic definitions, a number of results of that paper look somewhat similar to ours.

1.3. This paper. The definition of $\mathit{Kq}$ reflects the view that quantum states should not be accorded
the status of individual outcomes of experiments, and therefore $\mathit{Kq}$ strives only to approximate
specification. We go a little further, and approach quantum complexity using probability distributions
to start with. We find a universal semicomputable (semi-) density matrix ("universal probability")
and define a "complexity operator" as its negative logarithm. Depending on the order of taking the
logarithm and the expectation, two possible complexities are introduced for a quantum state $|\psi\rangle$:

$$\underline{H}(|\psi\rangle) \le \overline{H}(|\psi\rangle).$$

A number of properties of Kolmogorov complexity extend naturally to the new domain. Approximately,
a quantum state is simple if it is within a small distance from a low-dimensional subspace
of low Kolmogorov complexity. (Ideally, the three vague terms should play a role in the following
decreasing order of significance: dimension, complexity, closeness.) This property can be used to
relate our algorithmic entropy to both Vitányi's complexity and qubit complexity. We find that $\underline{H}$
is within a constant factor of Vitányi's complexity, and that $\overline{H}$ essentially lowerbounds qubit complexity
and upperbounds an oracle version of qubit complexity.

Though Vitányi's complexity is typically close to $2n$, while qubit complexity is $\stackrel{+}{<} n$, these are
differences only within a constant factor; on the other hand, occasionally $\underline{H}$ can be much smaller
than $\overline{H}$ and thus Vitányi's complexity is occasionally much smaller than qubit complexity. This is
due to the permissive way in which Vitányi's complexity deals with approximations.

The von Neumann entropy of a computable density matrix is within an additive constant from 
the average complexity. Some of the theory of randomness translates to the new domain, but new 
questions arise due to non-commutativity. 

The results on the maximal complexity of clones are sharp, and similar to those in [Berthiaume, Dam and Laplante].

2. Universal probability 

Let us call a nonnegative real function $f(x)$ defined on strings a semimeasure if $\sum_x f(x) \le 1$, and a
measure (a probability distribution) if the sum is 1. A function is called lower semicomputable if there
is a monotonically increasing sequence $g_n(x)$ of functions converging to it such that $(n, x) \mapsto g_n(x)$
is a computable function mapping into rational numbers. It is computable when it is both lower and
upper semicomputable. (A lower semicomputable measure can be shown to be also computable.)
The reason for introducing semicomputable semimeasures is not that computable measures are not 
felt general enough; rather, this step is analogous to the introduction of recursively enumerable 
sets and partial recursive functions. Just as there are "universal" (or, "complete" in terms of, say, 
many-one reduction) recursively enumerable sets but no universal recursive sets, there is a universal 
semicomputable semimeasure in the sense of the following proposition, even though there is no 
universal computable measure. 

Let $U$ be an optimal prefix Turing machine used in the definition of $K(x)$, and let $z = z_1, z_2, \dots$ be
an infinite sequence. Then the quantity $U(z)$ is well-defined: it is the output of $U$ when $z$ is written
on the input tape. Let $Z = Z_1, Z_2, \dots$ be an infinite coin-tossing 0-1 sequence, and let us define

$$m'(x) = \operatorname{Prob}[U(Z) = x]. \tag{2.1}$$
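For a prefix machine, a coin-tossing sequence begins with a given program $p$ with probability $2^{-l(p)}$, so the probability in (2.1) is the total weight of the programs that print $x$. The sketch below illustrates lower semicomputability on a toy example. The table `TOY_MACHINE` is a hypothetical stand-in for a real prefix machine (for which one would have to dovetail the simulation, since halting is undecidable), and `g` plays the role of the rational approximations $g_n(x)$ defined above.

```python
from fractions import Fraction

# Hypothetical toy prefix machine: a prefix-free table from programs to outputs.
TOY_MACHINE = {"0": "a", "10": "b", "110": "a", "1110": "c"}


def g(n: int, x: str) -> Fraction:
    """Rational lower approximation g_n(x): weight of programs of length <= n printing x."""
    return sum(
        (Fraction(1, 2 ** len(p)) for p, out in TOY_MACHINE.items()
         if len(p) <= n and out == x),
        Fraction(0),
    )
```

The values `g(n, x)` are rational, nondecreasing in `n`, and their sum over `x` never exceeds 1, which is what makes the limit a lower semicomputable semimeasure.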

Proposition 2.1 (Levin). There is a semicomputable semimeasure $\mu$ with the property that for any
other semicomputable semimeasure $\nu$ there is a constant $c_\nu > 0$ such that for all $x$ we have
$c_\nu \nu(x) \le \mu(x)$. Moreover, $\mu \stackrel{*}{=} m'$.

Proof sketch. We define a Turing machine $T$ that will output a sequence $(p_t, x_t, r_t)$ where $r_t$ is a
positive rational number. At any time $t$, let $r_t(p, x)$ be defined as follows. If there is no $i \le t$ for
which some $(p, x, r_i)$ has been output then $r_t(p, x) = 0$; otherwise, $r_t(p, x)$ is the maximum of
those $r_i$. The machine $T$ will have the following property for all $p$:

$$\sum_x r_t(p, x) \le 1. \tag{2.2}$$

To define $T$, take a universal Turing machine $V(p, x, n)$. Let $T$ simulate $V$ simultaneously on all
inputs. If at any stage of the simulation, some $V(p, x, n)$ has been found, then $T$ checks whether it can
interpret $V(p, x, n)$ as a positive rational number $r$, and whether it can output the triple $(p, x, r)$ while
keeping the condition (2.2). If yes, the triple is output, otherwise it is not, and the simulation
continues. Define $\nu(p, x) = \lim_t r_t(p, x)$. Then it is easy to check that $\mu(x) = \sum_p 2^{-p-1} \nu(p, x)$
satisfies the conditions of the proposition.

To show $\mu \stackrel{*}{=} m'$, note that the random variable whose distribution is $\mu$ can be represented as a
function of the coin-tossing infinite sequence. It is not difficult to check that the function in question
can be implemented by a prefix Turing machine. □

We will call any semicomputable semimeasure $\mu$ with the property in the proposition "universal".
Any two universal semimeasures dominate each other within a multiplicative constant. We fix one
such measure and denote it by

$$m(x)$$

and call it the universal probability. Its significance for complexity theory can be estimated by
the following theorem, deriving the prefix complexity $K(x)$ from the universal probability.

Proposition 2.2 (Levin's Coding Theorem). We have $K(x) \stackrel{+}{=} -\log m(x)$.

The lower bound $-\log m(x) \stackrel{+}{<} K(x)$ comes easily from the fact that $K(x)$ is upper semicomputable
and satisfies the "Kraft inequality" $\sum_x 2^{-K(x)} \le 1$. For the proof of the upper bound,
see [?].

The above concepts and results can be generalized to the case when we have an extra parameter 
in the condition: we will therefore talk about $m(x \mid N)$, the universal probability conditional to $N$, a
function maximal within a multiplicative constant among all lower semicomputable functions $f(x, N)$
which also satisfy the condition $\sum_x f(x, N) \le 1$. The coding theorem generalizes to
$2^{-K(x \mid N)} \stackrel{*}{=} m(x \mid N)$.

Constructive objects other than integers or strings can be encoded into integers in some canonical
way. Elementary quantum states $|\psi\rangle \in \mathcal{H}_N$ also correspond to integers, and this is how we
understand the expression

$$m(|\psi\rangle \mid N),$$

which is therefore nonzero only for elementary states $|\psi\rangle$. (This is not our definition of quantum
universal probability or complexity, only a tool from classical complexity theory helpful in its
discussion.)

The quantum analog of a probability distribution is a density matrix, a self-adjoint positive 
semidefinite operator with trace 1. Just as with universal probability, let us allow operators with 
trace less than 1, and call them semi-density matrices. 

We call a sequence $A_N$ of operators, where $A_N$ is defined over $\mathcal{H}_N$, lower semicomputable if there
is a double sequence of elementary operators $A_{Nk}$ with the property that for each $N$, the sequence
$A_{Nk}$ is increasing in $k$ and converges to $A_N$.

Lemma 2.3. 

1. A computable sequence of operators is also lower semicomputable. 

2. If $A_N$ is nonnegative then the elements of the sequence $A_{Nk}$ can be chosen nonnegative.

Proof. Both these statements are proved via standard approximations. □ 

From now on, we suppress the index N whenever it is not necessary to point out its presence for 
clarity. 
Theorem 2. There is a lower semicomputable semi-density matrix $\mu$ dominating all other such
matrices in the sense that for every other such matrix $\sigma$ there is a constant $c_\sigma > 0$ with $c_\sigma \sigma \le \mu$. We
have $\mu \stackrel{*}{=} \mu'$ where

$$\mu' = \sum_{|\phi\rangle} m(|\phi\rangle)\, |\phi\rangle\langle\phi|. \tag{2.3}$$

Also

$$\mu \stackrel{*}{=} \sum_{\nu} m(\nu)\, \nu \stackrel{*}{=} \sum_{P} m(P)\, P / \dim P$$

where $\nu$ runs through all elementary semi-density matrices and $P$ runs through all elementary
projections.



Proof. The proof of the existence of $\mu$ is completely analogous to the proof of Proposition 2.1.

To prove $\mu \stackrel{*}{=} \mu'$, note first that the form of its definition guarantees that $\mu'$ is a lower
semicomputable semi-density matrix, and therefore $\mu' \stackrel{*}{<} \mu$. It remains to prove $\mu \stackrel{*}{<} \mu'$. Since $\mu$ is lower
semicomputable, there is a nondecreasing sequence $\mu_k$ of elementary semi-density matrices such
that $\mu = \lim_k \mu_k$, with $\mu_0 = 0$. For $k \ge 1$, let $\delta_k = \mu_k - \mu_{k-1}$. Each of the nonnegative self-adjoint
operators $\delta_k$ can be represented as a sum

$$\delta_k = \sum_{i=1}^{N} p_{ki}\, |\phi_{ki}\rangle\langle\phi_{ki}|.$$

Thus, $\mu = \sum_{k,i} p_{ki}\, |\phi_{ki}\rangle\langle\phi_{ki}|$, with a computable sequence $p_{ki} \ge 0$, where $\sum_{k,i} p_{ki} \le 1$. The vectors
$|\phi_{ki}\rangle$ and the values $p_{ki}$ can be chosen elementary. Noting $p_{ki} \stackrel{*}{<} m(k, i) \stackrel{*}{<} m(|\phi_{ki}\rangle)$ finishes the
proof.

The statement of sum representations using projections and elementary density matrices is weaker
than the statement about $\mu'$. □

We will call $\mu$ the quantum universal (semi-) density matrix. Thus, the quantum universal
probability of a quantum state $|\psi\rangle$ is given by

$$\langle\psi|\mu|\psi\rangle.$$

A representation analogous to (2.1) holds also for the quantum universal probability fi. It is not 
necessary to introduce a quantum Turing machine in place of a classical Turing machine, since instead 
of outputting an elementary quantum state we can just output the probabilities themselves, 
leaving the preparation of the state itself to whatever device we want, which might as well be a 
quantum Turing machine. The output of $U(Z)$ classically is a probability distribution over the set
of strings: string $x$ comes out with probability $m(x)$. When the outputs are quantum states $|\phi\rangle$
with probability $m(|\phi\rangle)$, then the relevant output is not the distribution $|\phi\rangle \mapsto m(|\phi\rangle)$: by far not
all this information is available. The actual physical output is just the density matrix as given
in (2.3). Thus, we take the projection associated with each possible output $|\phi\rangle$, multiply it with its
probability and add up all these terms. Indeed, assume that $A$ is any self-adjoint operator expressing
some property. The expected value of $A$ over $U(Z)$ is given by $\operatorname{Tr} A\mu'$. In particular, suppose that
for some quantum state $|\psi\rangle$ we measure whether $U(Z) = |\psi\rangle$. The measurement will give a "yes"
answer with probability

$$\sum_{|\phi\rangle} m(|\phi\rangle)\, |\langle\phi|\psi\rangle|^2 = \sum_{|\phi\rangle} m(|\phi\rangle)\, \langle\psi|\bigl(|\phi\rangle\langle\phi|\bigr)|\psi\rangle = \langle\psi|\mu'|\psi\rangle = \operatorname{Tr} |\psi\rangle\langle\psi|\, \mu'.$$
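The identity between the weighted sum of squared overlaps and the expectation of the mixture can be checked numerically on a small example. The following is a minimal sketch in a two-dimensional space; the three output states and their weights are hypothetical stand-ins for $|\phi\rangle$ and $m(|\phi\rangle)$, and the helper names are not from the paper.

```python
import math


def inner(phi, psi):
    """<phi|psi> for vectors given as lists of complex coefficients."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def mixture(weighted_states):
    """Matrix sum_phi m(phi) |phi><phi| as a nested list."""
    dim = len(weighted_states[0][1])
    return [[sum(w * phi[i] * phi[j].conjugate() for w, phi in weighted_states)
             for j in range(dim)] for i in range(dim)]


def expectation(matrix, psi):
    """<psi|matrix|psi> (real for a self-adjoint matrix)."""
    dim = len(psi)
    return sum(psi[i].conjugate() * matrix[i][j] * psi[j]
               for i in range(dim) for j in range(dim)).real


s = 1 / math.sqrt(2)
# Hypothetical weights standing in for m(|phi>); they sum to less than 1.
outputs = [(0.5, [1 + 0j, 0j]), (0.25, [s + 0j, s + 0j]), (0.125, [s + 0j, -s * 1j])]
psi = [s + 0j, s * 1j]

lhs = sum(w * abs(inner(phi, psi)) ** 2 for w, phi in outputs)  # sum m(phi) |<phi|psi>|^2
rhs = expectation(mixture(outputs), psi)                        # <psi|mu'|psi>
```

Both `lhs` and `rhs` evaluate to 0.375 up to rounding: the first two states each overlap $|\psi\rangle$ with squared modulus $1/2$, and the third is orthogonal to it.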
These analogies suggest to us to define complexity also as a self-adjoint operator:

$$\kappa = -\log\mu. \tag{2.4}$$



Proposition 2.4. The operator function $A \mapsto \log A$ is monotonic.

For a proof, see [?]. This implies the upper semicomputability of $-\log\mu$. To help readers
appreciate that the proposition is nontrivial, we mention that, for example, $A \mapsto e^{A}$ is not monotonic
(see the same references). We will also use the following theorem, which could be called the "quantum
Jensen inequality":

Proposition 2.5. If $f(x)$ is a convex function in an interval $[a, b]$ containing the eigenvalues of
operator $A$ then for all $|\psi\rangle$ we have

$$f(\langle\psi|A|\psi\rangle) \le \langle\psi|f(A)|\psi\rangle. \tag{2.5}$$

Proof. Easy, see [?]. □
This implies:

Lemma 2.6. Let $f$ be a function concave in the interval $[a, b]$, and $|\psi\rangle$ a vector. Then the function
$A \mapsto \langle\psi|f(A)|\psi\rangle$ is concave for self-adjoint operators $A$ whose spectrum is contained in $[a, b]$.

We have now two alternative definitions for quantum complexity of a pure state, depending on 
the order of taking the logarithm and taking the expectation: 

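$$\underline{H}(|\psi\rangle) = -\log \langle\psi|\mu|\psi\rangle, \tag{2.6}$$

$$\overline{H}(|\psi\rangle) = -\langle\psi|(\log\mu)|\psi\rangle = \langle\psi|\kappa|\psi\rangle. \tag{2.7}$$

An inequality in one direction can be established between them easily:

Theorem 3.

$$\underline{H}(|\psi\rangle) \le \overline{H}(|\psi\rangle).$$

Proof. Use (2.5). □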
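Spelled out, the one-line proof can be read as follows. This is a sketch under the assumption that the eigenvalues of $\mu$ lie in an interval $[a, b]$ with $a > 0$, so that the convex function $f(x) = -\log x$ can be used in the quantum Jensen inequality with $A = \mu$:

```latex
\begin{align*}
\underline{H}(|\psi\rangle)
  &= -\log \langle\psi|\mu|\psi\rangle
   = f\bigl(\langle\psi|\mu|\psi\rangle\bigr) \\
  &\le \langle\psi|f(\mu)|\psi\rangle
   = -\langle\psi|(\log\mu)|\psi\rangle
   = \overline{H}(|\psi\rangle).
\end{align*}
```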

The difference between the two quantities can be very large, as shown by the following example. 

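Example 2.7. Let $|1\rangle, \dots, |N\rangle$ be the eigenvectors of $\mu$ ordered by decreasing eigenvalues $p_i$. Then
$p_1 \stackrel{*}{=} 1$ and $p_N \stackrel{*}{=} N^{-1}$. For the vector $|\psi\rangle = 2^{-1/2}(|1\rangle + |N\rangle)$ we have

$$\underline{H}(|\psi\rangle) = -\log \langle\psi|\mu|\psi\rangle = -\log(p_1/2 + p_N/2) \stackrel{+}{=} 0,$$

$$\overline{H}(|\psi\rangle) = \langle\psi|\kappa|\psi\rangle = (-\log p_1 - \log p_N)/2 \stackrel{+}{=} (\log N)/2.$$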
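The gap in this example is easy to reproduce numerically. The sketch below works in the eigenbasis of $\mu$ and uses illustrative eigenvalues only ($p_1 = 1/2$ and $p_N = 1/N$ with $N = 2^{20}$, which are assumptions, not values of the actual universal matrix), with logarithms taken base 2.

```python
from math import log2


def entropies(p_first: float, p_last: float) -> tuple[float, float]:
    """(H_lower, H_upper) for |psi> = (|1> + |N>)/sqrt(2) in the eigenbasis of mu."""
    h_lower = -log2(p_first / 2 + p_last / 2)      # -log <psi|mu|psi>
    h_upper = (-log2(p_first) - log2(p_last)) / 2  # <psi|(-log mu)|psi>
    return h_lower, h_upper


# Illustrative eigenvalues only: p_1 = 1/2 (a constant) and p_N = 1/N with N = 2**20.
h_lower, h_upper = entropies(0.5, 2.0 ** -20)
```

Here `h_lower` stays just below 2 regardless of $N$, while `h_upper` equals $(1 + 20)/2 = 10.5$ and grows like $(\log N)/2$.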

Which one of the two definitions is more appropriate? We prefer $\overline{H}$ since we like the idea of a
complexity operator; however, in the present paper, we try to study both.

The complexity $\mathit{Kq}$ introduced in [Vitányi] can be viewed as the formula resulting from $\underline{H}(|\psi\rangle)$ when
the sum in (2.3) is replaced with a supremum. In classical algorithmic information theory, the result
does not change by more than a multiplicative constant after replacement, but Theorem 1 shows
that it does in the quantum case.

Remark 2.8. It seems natural to generalize $\overline{H}(|\psi\rangle)$ and $\underline{H}(|\psi\rangle)$ to density matrices $\rho$ by

$$\overline{H}(\rho) = \operatorname{Tr} \kappa\rho, \qquad \underline{H}(\rho) = -\log \operatorname{Tr} \mu\rho,$$

but we do not explore this path in the present paper, and are not even sure that this is the right
generalization. $\Diamond$

3. Properties of algorithmic entropy 

3.1. Relation to classical description complexity. It was one of the major attractions of the original 
Kolmogorov complexity that it could be defined without reference to probability and then it could be 
used to characterize randomness. Unfortunately we do not have any characterization, even to good
approximation, of $\overline{H}(|\psi\rangle)$ or $\underline{H}(|\psi\rangle)$ in terms avoiding probability. As a generalization of classical
complexity it has the properties of classical complexity in the original domain, just as $\mathit{Kq}$ and qubit
complexity.

Theorem 4. Let $|1\rangle, |2\rangle, \dots$ be a computable orthogonal sequence of states. Then for $H = \overline{H}$ or $\underline{H}$,
we have

$$H(|i\rangle) \stackrel{+}{=} K(i), \tag{3.1}$$

where the constant in $\stackrel{+}{=}$ depends on the definition of the sequence.

Proof. The function $f(i) = \langle i|\mu|i\rangle$ is lower semicomputable with $\sum_i f(i) \le 1$, hence it is dominated
by $m(i)$. This shows $K(i) \stackrel{+}{<} \underline{H}(|i\rangle)$.

On the other hand, the semi-density matrix $\rho = \sum_i m(i)\, |i\rangle\langle i|$ is lower semicomputable, so $\rho \stackrel{*}{<} \mu$,
$-\log\rho \stackrel{+}{>} \kappa$, hence

$$K(i) \stackrel{+}{=} \langle i|(-\log\rho)|i\rangle \stackrel{+}{>} \langle i|\kappa|i\rangle = \overline{H}(|i\rangle).$$

□

3.2. Upper and lower bounds in terms of small simple subspaces. The simple upper bound follows 
immediately from the domination property of universal probability. 

Theorem 5. Assume that $|\psi\rangle \in \mathcal{H}_N$. Then

$$\kappa \stackrel{+}{<} (\log N)\,\mathbf{1}.$$

In particular, if $|\psi\rangle \in Q_n$ then $\overline{H}(|\psi\rangle) \stackrel{+}{<} n$.

Proof. Let $\rho = N^{-1}\mathbf{1}$, then $\rho \stackrel{*}{<} \mu$, hence $\kappa \stackrel{+}{<} (\log N)\,\mathbf{1}$. □

Remark 3.1. $N$ is an implicit parameter here, so it is more correct to write $\kappa(\cdot \mid N) \stackrel{+}{<} (\log N)\,\mathbf{1}$. We
do not have any general definition of quantum conditional complexity (just as no generally accepted
notion of quantum conditional entropy is known), but conditioning on a classical parameter is not
problematic. $\Diamond$

There is a more general theorem for classical complexity. For a finite set A let K(A) be the length 
of the shortest program needed to enumerate the elements of $A$. Then for all $x \in A$ we have

K{x) < K(A) + log # A + 2 log #A. 

What may correspond to a simple finite set $A$ is a projector $P$ that is lower semicomputable as a
nonnegative operator. What corresponds to $\#A$ is the dimension $\operatorname{Tr} P$ of the subspace to which $P$
projects. What corresponds to $x \in A$ is measuring the angle between $|\psi\rangle$ and the space to which $P$
projects.

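Theorem 6. Let $P$ be a lower semicomputable projection with $d = \operatorname{Tr} P$. We have

$$\underline{H}(|\psi\rangle) \stackrel{+}{<} K(P) + \log d - \log \langle\psi|P|\psi\rangle, \tag{3.2}$$

$$\overline{H}(|\psi\rangle) \stackrel{+}{<} K(P) + \log d + (1 - \langle\psi|P|\psi\rangle) \log N. \tag{3.3}$$

Proof. Let $\rho$ be the semi-density matrix

$$\frac{1}{2}\Bigl(\frac{P}{d} + \frac{\mathbf{1} - P}{N}\Bigr) = \frac{1}{2N}\,\mathbf{1} + P\Bigl(\frac{1}{2d} - \frac{1}{2N}\Bigr).$$

From the first form, it can be seen that it is a semi-density matrix; from the second form, it can be seen that
it is lower semicomputable. By Theorem 2, we have $2^{-K(\rho)}\rho \stackrel{*}{<} \mu$. Since $K(\rho) \stackrel{+}{=} K(P)$, we have

$$\underline{H}(|\psi\rangle) = -\log \langle\psi|\mu|\psi\rangle \stackrel{+}{<} K(\rho) - \log \langle\psi|(P/d)|\psi\rangle = K(P) + \log d - \log \langle\psi|P|\psi\rangle.$$

On the other hand, since $-\log\rho$ has eigenvalue $\log 2d$ on the range of $P$ and $\log 2N$ on its orthogonal complement,

$$\overline{H}(|\psi\rangle) \stackrel{+}{<} K(\rho) - \langle\psi|(\log\rho)|\psi\rangle \stackrel{+}{<} K(P) + \langle\psi|P|\psi\rangle \log d + (1 - \langle\psi|P|\psi\rangle) \log N.$$

□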

This theorem points out again the difference between $\underline{H}$ and $\overline{H}$. If $|\psi\rangle$ has a small angle with a
small-dimensional subspace this makes $\underline{H}(|\psi\rangle)$ small. For $\overline{H}(|\psi\rangle)$, the size of the angle gets multiplied
by $\log N$, so if nothing more is known about $|\psi\rangle$ then not only the dimension of $P$ counts but also
the dimension of the whole space we are in.
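A quick numerical comparison shows how differently the two bounds of Theorem 6 react to the ambient dimension. The sketch below evaluates the right-hand sides of (3.2) and (3.3) without their additive constants; the numbers ($K(P) = 10$, $d = 4$, $N = 2^{100}$, overlap $1/2$) are hypothetical, and logarithms are base 2.

```python
from math import log2


def theorem6_bounds(k_p: float, d: int, n_dim: int, overlap: float) -> tuple[float, float]:
    """Right-hand sides of (3.2) and (3.3), up to the additive constants."""
    lower_bound = k_p + log2(d) - log2(overlap)                 # bound on H_lower
    upper_bound = k_p + log2(d) + (1 - overlap) * log2(n_dim)   # bound on H_upper
    return lower_bound, upper_bound


# Hypothetical numbers: K(P) = 10, d = 4, N = 2**100, <psi|P|psi> = 0.5.
b_lower, b_upper = theorem6_bounds(10, 4, 2 ** 100, 0.5)
```

With these numbers the first bound is 13 and does not depend on $N$ at all, while the second is 62, of which 50 comes from the term $(1 - \langle\psi|P|\psi\rangle)\log N$.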

Above, we defined what it means for a program to recursively "enumerate a subspace" by saying 
that it approximates the projector from below as a nonnegative operator: call this "weak enumer- 
ation" . There is a simpler possible definition: let the program just list a sequence of orthogonal 
vectors that generate the subspace: call this "strong enumeration" . 

Remarks 3.2. 

1. The rest of the paper makes no use of the discussion of strong and weak enumeration, so this
part can be skipped.

2. What is important is not only that the sequence of vectors in question can be enumerated, 
since this is in some sense trivially true for any finite sequence of elementary vectors. A 
recursively enumerable finite-dimensional subspace is always elementary. What matters is 
that the enumeration is done with a short program (which can use the dimension N as input). 
Without this remark, there is clearly no difference between an elementary subspace and a 
strongly enumerable one. 



Proposition 3.3. The strong and weak kinds of enumeration of a subspace are equivalent. In other
words, there is a program of length $k$ enumerating a subspace in the weak sense if and only if there
is a program of length $\stackrel{+}{=} k$ enumerating it in the strong sense.

Proof. Given a strong enumeration $|\phi_1\rangle, |\phi_2\rangle, \dots$, the sum $\sum_i |\phi_i\rangle\langle\phi_i|$ clearly defines the projector
in a form from which the possibility of approximating it from below is seen.

Assume now that $P$ is a projector and $\rho_1 \le \rho_2 \le \cdots$ is a sequence of elementary nonnegative
operators approximating it.

Note that for a nonnegative operator $A$, we have $\langle\psi|A|\psi\rangle = 0$ iff $A|\psi\rangle = 0$. Now for any of
the $\rho_i$, and any vector $|\psi\rangle$, if $P|\psi\rangle = 0$ then $\langle\psi|P|\psi\rangle = 0$, which implies $\langle\psi|\rho_i|\psi\rangle = 0$ and thus
$\rho_i|\psi\rangle = 0$. Hence the kernel of $\rho_i$ contains the kernel of $P$ and hence the space of eigenvectors of $\rho_i$
with positive eigenvalues is contained in $P\mathcal{H}$. This shows that from $\rho_i$, $i = 1, 2, \dots$ we will be
able to build up a sequence $|\phi_1\rangle, |\phi_2\rangle, \dots$ of orthogonal vectors spanning $P\mathcal{H}$. □

Theorem 7 below is analogous to the simple lower bound on classical description complexity. That
lower bound says that the number of objects $x$ with $K(x) < k$ is at most $2^k$. What corresponds
here to "number of objects" is dimension, and the statement is approximate: if $|\psi\rangle$ has complexity
$< k$ then it is within a small angle from a certain fixed $2^{k+1}$-dimensional space. The angle is really
small for $\overline{H}$; it is not so small for $\underline{H}$ but it is still small enough that the whole domain within that
angle makes up only a small portion of the Hilbert space.

Let $|u_1\rangle, |u_2\rangle, \dots$ be the sequence of eigenvectors of $\mu$ with eigenvalues $\mu_1 \ge \mu_2 \ge \cdots$. (Since our
space is finite-dimensional, the sequence exists.) Let $\kappa_i = -\log\mu_i$. Let $E_k$ be the projector to the
subspace generated by $|u_1\rangle, \dots, |u_k\rangle$.

Remark 3.4. The universal density matrix $\mu$ is an object with an impressive invariance property: for
any other universal density matrix $\nu$ we have $\nu \stackrel{*}{=} \mu$. On the other hand, the individual eigenvectors
$|u_i\rangle$ probably do not have any invariant significance. It is currently not clear whether even the
projectors $E_k$ enjoy any approximate invariance property.

Theorem 7 (Lower bounds). Let $|\psi\rangle$ be any vector and let $\lambda > 1$. If $\overline{H}(|\psi\rangle) < k$ then we have

$$\langle\psi|E_{2^{\lambda k}}|\psi\rangle > 1 - 1/\lambda. \tag{3.4}$$

If $\underline{H}(|\psi\rangle) < k$ then we have

$$\langle\psi|E_{\lambda 2^{k}}|\psi\rangle > 2^{-k}(1 - 1/\lambda). \tag{3.5}$$

Proof. Assume $\overline{H}(|\psi\rangle) < k$ and expand $|\psi\rangle$ in the basis $\{|u_i\rangle\}$ as $|\psi\rangle = \sum_i c_i |u_i\rangle$. By the assumption,
we have $\sum_i \kappa_i |c_i|^2 < k$. Let $m$ be the first $i$ with $\kappa_i > \lambda k$. Since $\sum_i 2^{-\kappa_i} \le 1$ we have $m \le 2^{\lambda k}$.
Also,

$$\lambda k \sum_{i \ge m} |c_i|^2 \le \sum_i \kappa_i |c_i|^2 < k,$$

hence $\sum_{i \ge m} |c_i|^2 < 1/\lambda$, which proves (3.4).

Now assume $\underline{H}(|\psi\rangle) < k$, then we have $\sum_i \mu_i |c_i|^2 > 2^{-k}$. Let $m$ be the first $i$ with $\mu_i < 2^{-k}/\lambda$.
Since $\sum_i \mu_i \le 1$ we have $m \le 2^{k}\lambda$. Also,

$$\sum_{i \ge m} \mu_i |c_i|^2 < (2^{-k}/\lambda) \sum_i |c_i|^2 = 2^{-k}/\lambda,$$

hence

$$\langle\psi|E_m|\psi\rangle \ge \sum_{i < m} |c_i|^2 \ge \sum_{i < m} \mu_i |c_i|^2 > 2^{-k} - \sum_{i \ge m} \mu_i |c_i|^2 > 2^{-k}(1 - 1/\lambda). \tag{3.6}$$

□
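The first half of the proof is a Markov-type tail estimate: if the average of the $\kappa_i$ under the weights $|c_i|^2$ is small, then little weight can sit on eigenvectors with large $\kappa_i$. The sketch below checks this on toy numbers, working in the eigenbasis of $\mu$. The spectrum $\mu_i = 2^{-i}$ and the squared coefficients are hypothetical, and the threshold is taken at $\lambda$ times the computed value of $\overline{H}$ rather than at a separate $k$.

```python
from math import log2


def weight_on_simple_eigenvectors(mu, coeffs_sq, lam):
    """Return (H_upper, weight): weight is the mass of |psi> on eigenvectors with kappa_i <= lam * H_upper."""
    kappa = [-log2(m) for m in mu]
    h_upper = sum(k * c for k, c in zip(kappa, coeffs_sq))  # sum_i kappa_i |c_i|^2
    weight = sum(c for k, c in zip(kappa, coeffs_sq) if k <= lam * h_upper)
    return h_upper, weight


# Hypothetical spectrum mu_i = 2**-(i+1) (trace < 1) and squared coefficients |c_i|^2.
mu = [2.0 ** -(i + 1) for i in range(10)]
coeffs_sq = [0.6, 0.2, 0.1, 0.05, 0.05, 0.0, 0.0, 0.0, 0.0, 0.0]
lam = 2.0
h_upper, weight = weight_on_simple_eigenvectors(mu, coeffs_sq, lam)
```

Here the average complexity is 1.75, the threshold is 3.5, and the weight on the first three eigenvectors is 0.9, comfortably above the guaranteed $1 - 1/\lambda = 0.5$.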

The defect of this theorem is that the operators $E_k$ are uncomputable. I do not know whether
the above properties can be claimed for some lower semicomputable operators $F_k$.

3.3. Quantum description complexities. 

3.3.1. Vitányi's complexity. Theorem 8 says that the complexity $\mathit{Kq}$ from [Vitányi] (defined in Section 1.2)
is not too much larger than $\underline{H}$, so we do not lose too much in replacing the sum (2.3) with a
supremum: if the sum is $> 2^{-k}$ then the supremum is $> 2^{-4k}/k^2$.

Theorem 8 (Relation to $\mathit{Kq}$).

$$\underline{H} \stackrel{+}{<} \mathit{Kq} \stackrel{+}{<} 4\underline{H} + 2\log\underline{H}. \tag{3.7}$$

Proof. We start from the end of the proof of Theorem 7. We use (3.6) with $\lambda = 2$, and note that
one term, say, $|c_r|^2$ of the sum $\sum_{i<m} |c_i|^2$ must be at least $2^{-2k-2}$. We would be done if we could
upperbound $K(|u_r\rangle)$ appropriately. It would seem that $K(|u_r\rangle)$ can be bounded approximately by
$k$ since $m \le 2^{k+1}$. But unfortunately, neither the vectors $|u_i\rangle$ nor their sequence are computable;
so, an approximation is needed. Let $r$ be the largest binary number of length $\le k$ smaller than $\operatorname{Tr}\mu$.
Then there is a program $p$ of length $\le k + 2\log k$ computing a lower approximation $\hat\mu$ of $\mu$ such that
$\operatorname{Tr}\mu - \operatorname{Tr}\hat\mu \le 2^{-k}$. Indeed, let $p$ specify the binary digits of $r$ and then compute an approximation
of $\operatorname{Tr}\mu$ that exceeds $r$.

The condition $\langle\psi|\mu|\psi\rangle > 2^{-k}$ implies $\langle\psi|\hat\mu|\psi\rangle > 2^{-k+1}$. We can now proceed with $\hat\mu$ as with $\mu$.
We compute eigenvectors $|u_i\rangle$ for $\hat\mu$, and find an elementary vector $|u_r\rangle$ with

$$K(|u_r\rangle) \stackrel{+}{<} 2k + 2\log k, \qquad |\langle\psi|u_r\rangle|^2 > 2^{-2k}.$$

The extra $k + 2\log k$ in $K(|u_r\rangle)$ is coming from the program $p$ above. □
3.3.2. Qubit complexity. Let us define the qubit complexity introduced in [Berthiaume, Dam and
Laplante]. We refer to that paper for further references on quantum Turing machines and detailed
specifications of the quantum Turing machine used. Our machine starts from an input (on the input
tape) consisting of a qubit program and a rational number $\varepsilon > 0$. On the output tape, an output
appears, preceded by a 0/1 symbol telling whether the machine is considered halted. The halting
symbol as well as the content of the output tape does not change after the halting symbol turns 1.
(The input tape, which is also the work tape, keeps changing.) We can assume that input and output
strings of different lengths can always be padded to the same length at the end by 0's, or if this is
inconvenient, by some special "blank", or "vacuum" symbol. The input of the machine is a density
matrix $\rho$. For any segment of some length $n$ of the output, and any given time $t$ there is a completely
positive operator $\Phi_{n,t}$ such that the $n$ symbols of the output at time $t$ are described by a density
matrix $\sigma = \Phi_{n,t}\rho$. We only want to consider the output state when the machine halted. If $H$ is a
projection to the set of those states then the semi-density matrix $H\sigma H$ is the output we are
interested in. The operation $\Psi_{n,t} : \rho \mapsto H\sigma H$ is a completely positive operator but it is not
trace-preserving: it may decrease the trace. It is also monotonically increasing in $t$.

For a state $|\psi\rangle$, let $QC^{\varepsilon}(|\psi\rangle)$ be the length $k$ of the smallest qubit program (an arbitrary state
in $Q_k$, or more precisely the density matrix corresponding to this pure state) which, when given as
input along with $\varepsilon$, results in an output density matrix $\sigma$ with $\langle\psi|\sigma|\psi\rangle \ge 1 - \varepsilon$. The paper
[Berthiaume, Dam and Laplante] shows that this quantity has the same machine-independence
properties as Kolmogorov complexity, so we also assume that a suitable universal quantum Turing
machine has been fixed. For the following theorem, we will compute complexities of strings in
$\mathcal{H}_N = Q_n$, so $N = 2^n$.

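Lemma 3.5. If for a semi-density matrix $\rho$ and a state $|\psi\rangle$ we have $\langle\psi|\rho|\psi\rangle \ge 1 - \varepsilon$ and $\rho$ has the
eigenvalue decomposition $\sum_i p_i |i\rangle\langle i|$ where $p_1 \ge p_2 \ge \cdots$, then

$$p_1 \ge 1 - \varepsilon, \qquad |\langle 1|\psi\rangle|^2 \ge 1 - 2\varepsilon.$$

Proof. Let $c_i = \langle i|\psi\rangle$, then $\langle\psi|\rho|\psi\rangle = \sum_i p_i |c_i|^2 \ge 1 - \varepsilon$. Hence $p_1 \ge 1 - \varepsilon$, therefore

$$|c_1|^2 + \varepsilon \sum_{i > 1} |c_i|^2 \ge 1 - \varepsilon,$$

giving $|c_1|^2 \ge 1 - 2\varepsilon$. □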
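The lemma can be checked on concrete numbers by working directly in the eigenbasis of $\rho$, where the hypothesis is a weighted sum of the eigenvalues. The sketch below is only an illustration: the function name and the sample values are hypothetical, and the check covers a single instance rather than proving anything.

```python
def lemma_3_5_holds(p, coeffs_sq, eps):
    """Check Lemma 3.5 in the eigenbasis of rho.

    p: eigenvalues of rho in decreasing order; coeffs_sq: the values |<i|psi>|^2.
    """
    fidelity = sum(pi * ci for pi, ci in zip(p, coeffs_sq))  # <psi|rho|psi>
    if fidelity < 1 - eps:
        return True  # hypothesis fails, nothing to check
    return p[0] >= 1 - eps and coeffs_sq[0] >= 1 - 2 * eps


# Illustrative numbers: the trace of rho is 1 and |psi> is a unit vector.
p = [0.95, 0.03, 0.02]
coeffs_sq = [0.97, 0.02, 0.01]
eps = 0.08
```

With these values $\langle\psi|\rho|\psi\rangle = 0.9223 \ge 0.92$, and indeed $p_1 = 0.95 \ge 0.92$ and $|c_1|^2 = 0.97 \ge 0.84$.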

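Theorem 9. For $\varepsilon < 0.5$, if $QC^{\varepsilon}(|\psi\rangle) \le k$ then

$$\overline{H}(|\psi\rangle) \stackrel{+}{<} k + K(k) + 2\varepsilon n.$$

Proof. For each $k$, let $I_k$ be the projection to the space $Q_k$ of $k$-length inputs. The operator

$$\Lambda = \sum_k m(k)\, 2^{-k} I_k$$

is a semicomputable semi-density matrix on the set of all inputs. For each time $t$, the semi-density
matrix $\Psi_{n,t}\Lambda$ is semicomputable. As it is increasing in $t$, the limit $\nu = \lim_t \Psi_{n,t}\Lambda$ is a semicomputable
semi-density matrix, and therefore $\nu \stackrel{*}{<} \mu$. Let $|\phi\rangle \in Q_k$, then $|\phi\rangle\langle\phi| \le I_k$, hence $m(k)2^{-k}|\phi\rangle\langle\phi| \le \Lambda$,
hence for each $t$ we have

$$m(k)\, 2^{-k}\, \Psi_{n,t} |\phi\rangle\langle\phi| \le \nu \stackrel{*}{<} \mu.$$

Since also $2^{-n} I_n \stackrel{*}{<} \mu$, we can assert, with $\rho_{t,k} = \Psi_{n,t}|\phi\rangle\langle\phi|$, that

$$\sigma = m(k)\, 2^{-k} \rho_{t,k} + 2^{-n} I_n \stackrel{*}{<} \mu.$$

Assume that $\langle\psi|\rho_{t,k}|\psi\rangle \ge 1 - \varepsilon$. Then by Lemma 3.5, if $\rho_{t,k}$ has the eigenvalue decomposition
$\sum_i p_i |i\rangle\langle i|$ then $p_1 \ge 1 - \varepsilon$ and $|\langle 1|\psi\rangle|^2 \ge 1 - 2\varepsilon$. The matrix $(-\log\sigma)$ can be written as

$$-\sum_i \log\bigl(m(k)\, 2^{-k} p_i + 2^{-n}\bigr)\, |i\rangle\langle i|.$$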



QUANTUM ALGORITHMIC ENTROPY 






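Hence, with $c_i = \langle i|\psi\rangle$, and using Lemma 3.5 and $\varepsilon < 0.5$,

$$-\langle\psi|\log\mu|\psi\rangle \stackrel{+}{<} -\langle\psi|\log\sigma|\psi\rangle = -\sum_i \log\bigl(m(k)\, 2^{-k} p_i + 2^{-n}\bigr)\, |c_i|^2 \stackrel{+}{<} k + K(k) - \log(1 - \varepsilon) + 2\varepsilon n.$$

In the last inequality, the terms $k + K(k) - \log(1 - \varepsilon)$ come from the first term of the previous sum, while $2\varepsilon n$
comes from the rest of the terms. □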

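Using the definitions of Berthiaume, van Dam and Laplante, we write $QC(|\psi\rangle) \le k$ if there is a $|\phi\rangle$
such that for all $\varepsilon$ of the form $1/m$, when $|\phi\rangle$ is given as input along with $\varepsilon$, we get an output
density matrix $\sigma$ with $\langle\psi|\sigma|\psi\rangle \ge 1-\varepsilon$. The above theorem implies that in this case,

$$\overline{H}(|\psi\rangle) \stackrel{+}{<} k + K(k). \tag{3.8}$$

Let $x$ be a bit string, then, as shown earlier, we know that

$$H(|x\rangle) \stackrel{+}{=} K(x). \tag{3.9}$$

It has been shown by Berthiaume, van Dam and Laplante that $QC(|x\rangle) \stackrel{+}{<} C(x)$, where $C(x)$ is the
(not prefix-free) Kolmogorov complexity. We can show directly that also $C(x) \stackrel{+}{<} QC(|x\rangle)$, but we
will not do it in this paper.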



It follows from (3.8) and (3.9) that $K(x) \stackrel{+}{=} H(|x\rangle) \stackrel{+}{<} QC(|x\rangle) + K(QC(|x\rangle))$. This is in some way
stronger, since another interesting quantity, $H(|x\rangle)$, is interpolated, and in another way it seems
slightly weaker. But only very slightly, since one can bound $K(x)$ by $C(x)$ in general only via

$$K(x) \stackrel{+}{<} C(x) + K(C(x)).$$
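The classical bound $K(x) \stackrel{+}{<} C(x) + K(C(x))$ comes from making a plain program self-delimiting by
prefixing it with a prefix-free description of its length. The sketch below illustrates this with an
Elias-gamma-style length code, which costs about $2\log n$ bits and is therefore only an upper bound on
$K(n)$, not $K(n)$ itself. The function names are hypothetical.

```python
def encode_length(n):
    """Elias-gamma-style self-delimiting code for an integer n >= 0."""
    bits = bin(n + 1)[2:]                  # n + 1 >= 1, so this starts with '1'
    return "0" * (len(bits) - 1) + bits    # unary length prefix, then the binary digits

def self_delimiting(program):
    """Prefix a bit string with a self-delimiting description of its length."""
    return encode_length(len(program)) + program

def decode_stream(stream):
    """Split a concatenation of self-delimited programs back into the programs."""
    programs, pos = [], 0
    while pos < len(stream):
        zeros = 0
        while stream[pos + zeros] == "0":  # read the unary prefix
            zeros += 1
        n = int(stream[pos + zeros: pos + 2 * zeros + 1], 2) - 1
        pos += 2 * zeros + 1
        programs.append(stream[pos: pos + n])
        pos += n
    return programs

if __name__ == "__main__":
    progs = ["1011", "", "0", "111000111"]
    stream = "".join(self_delimiting(p) for p in progs)
    print(decode_stream(stream) == progs)
```

Because no codeword is a prefix of another, the programs can be concatenated and still be separated
unambiguously, which is the property that distinguishes $K$ from $C$.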



Just as we obtained an upper bound on $K_Q$ using (3.5) combined with an approximation of the
uncomputable $\mu$, we may hope to obtain an upper bound on $QC$ using (3.4) combined with a suitable
approximation of the uncomputable $\mu$ or $(-\log\mu)$. But we did not find an approximation in this
case for a reasonable price in complexity: the best we can say replaces $\overline{H}(|\psi\rangle)$ with $\langle\psi|(-\log\mu)|\psi\rangle$
for any computable density matrix $\mu$. Or, we can upperbound not $QC(|\psi\rangle)$ but $QC(|\psi\rangle \mid \chi)$ where
$\chi$ is an encoding of the halting problem into a suitable infinite binary string. The concept of an
oracle quantum computation with a read-only classical oracle tape presents no difficulties.

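Theorem 10. For each rational $\varepsilon$ and any computable density matrix $\mu$ we have

$$QC^{\varepsilon}(|\psi\rangle) \stackrel{+}{<} \langle\psi|(-\log\mu)|\psi\rangle/\varepsilon + K(\mu).$$

Similarly,

$$QC^{\varepsilon}(|\psi\rangle \mid \chi) \stackrel{+}{<} \overline{H}(|\psi\rangle)/\varepsilon.$$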



Proof. For the second inequality, we can use (3.4) with $k = \overline{H}(|\psi\rangle)$ and $\lambda = 1/\varepsilon$. The oracle $\chi$
allows us to compute the space $E_{2^{-\lambda k}}$ with arbitrary precision. Then our quantum Turing machine
can simply map the space of $\lambda k$-length qubit strings into the (approximate) $E_{2^{-\lambda k}}$.

Similarly, for the first inequality, if $\mu$ is computable then we can compute the subspaces
corresponding to $E_{2^{-\lambda k}}$ with arbitrary precision. □

3.4. Invariance under computable transformations. 

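Theorem 11. Let $U$ be any computable unitary transformation. Then we have

$$\overline{H}(U|\psi\rangle) \stackrel{+}{=} \overline{H}(|\psi\rangle), \qquad \underline{H}(U|\psi\rangle) \stackrel{+}{=} \underline{H}(|\psi\rangle).$$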

Proof. Straightforward. □ 

This theorem needs to be generalized: it should be understood how complexity changes under a
completely positive operator.

4. Complexity and entropy

In classical algorithmic information theory, if $\rho$ is a discrete computable probability distribution
then its entropy is equal, to a good approximation, to the average complexity. In the quantum case,
entropy is defined as

$$S(\rho) = -\operatorname{Tr}\rho\log\rho.$$

There is a quantity corresponding to the Kullback information distance, called relative entropy
in the literature: it is defined as

$$S(\rho \,\|\, \sigma) = \operatorname{Tr}\rho(\log\rho - \log\sigma),$$

where $\rho$ and $\sigma$ are density matrices.
Proposition 4.1.

$$S(\rho \,\|\, \sigma) \ge 0. \tag{4.1}$$

Proof. This is Klein's inequality, a standard fact of quantum information theory. □
The following theorem can be interpreted as saying that entropy is equal to average complexity:

Theorem 12. For any lower semicomputable semi-density matrix $\rho$ we have

$$S(\rho) \stackrel{+}{=} \operatorname{Tr}\rho\kappa. \tag{4.2}$$

Proof. Let $\Omega = \operatorname{Tr}\mu$, then $\sigma = \mu/\Omega$ is a density matrix, and hence by (4.1), $S(\rho \,\|\, \sigma) \ge 0$. It follows
that $S(\rho) \stackrel{+}{<} \operatorname{Tr}\rho\kappa$.

On the other hand, since $\rho \stackrel{*}{<} \mu$, the monotonicity of logarithm gives $\kappa \stackrel{+}{<} -\log\rho$ which gives the
other inequality. □
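The first half of this proof can be written out in one line. The following worked step is a sketch
that assumes $\kappa = -\log\mu$ (the complexity operator) and that $\rho$ is normalized well enough for
$S(\rho \,\|\, \sigma)$ to be defined; it only expands the definition of relative entropy with $\sigma = \mu/\Omega$.

```latex
\begin{align*}
0 \le S(\rho \,\|\, \sigma)
  &= \operatorname{Tr}\rho\log\rho - \operatorname{Tr}\rho\log(\mu/\Omega) \\
  &= -S(\rho) + \operatorname{Tr}\rho\kappa + (\log\Omega)\operatorname{Tr}\rho ,
\end{align*}
% Since \Omega = \operatorname{Tr}\mu \le 1, the last term is nonpositive, hence
\[
S(\rho) \le \operatorname{Tr}\rho\kappa + (\log\Omega)\operatorname{Tr}\rho \le \operatorname{Tr}\rho\kappa .
\]
```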

For what follows, the following property of the logarithm is useful:

Lemma 4.2. If $A$ and $B$ are nonnegative operators over $X$ and $Y$ respectively, then

$$\log(A \otimes B) = (\log A) \otimes 1_Y + 1_X \otimes (\log B). \tag{4.3}$$

Proof. Direct computation. □
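The direct computation can be spelled out in the eigenbases of the two operators. The sketch below
assumes, for simplicity, that $A$ and $B$ have strictly positive eigenvalues $a_i$, $b_j$ with eigenvectors
$|i\rangle$, $|j\rangle$; these symbols are introduced here only for illustration.

```latex
\begin{align*}
A \otimes B &= \sum_{i,j} a_i b_j \, |i\rangle\langle i| \otimes |j\rangle\langle j| , \\
\log(A \otimes B)
  &= \sum_{i,j} (\log a_i + \log b_j) \, |i\rangle\langle i| \otimes |j\rangle\langle j| \\
  &= \Bigl(\sum_i \log a_i \, |i\rangle\langle i|\Bigr) \otimes \sum_j |j\rangle\langle j|
   + \sum_i |i\rangle\langle i| \otimes \Bigl(\sum_j \log b_j \, |j\rangle\langle j|\Bigr) \\
  &= (\log A) \otimes 1_Y + 1_X \otimes (\log B) .
\end{align*}
```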

Some properties of complexity that can be deduced from its universal probability formulation will
carry over to the quantum form. As an example, take subadditivity:

$$K(x,y) \stackrel{+}{<} K(x) + K(y).$$

What corresponds to this in the quantum formulation is the following:

Theorem 13 (Subadditivity). We have

$$\mu_X \otimes \mu_Y \stackrel{*}{<} \mu_{XY}. \tag{4.4}$$

For $|\phi\rangle, |\psi\rangle \in \mathcal{H}_N$ and $H = \overline{H}$ or $\underline{H}$ we have

$$H(|\phi\rangle|\psi\rangle) \stackrel{+}{<} H(|\phi\rangle) + H(|\psi\rangle). \tag{4.5}$$

Proof. The density matrix $\mu_X \otimes \mu_Y$ over the space $\mathcal{H}_{XY} = \mathcal{H}_X \otimes \mathcal{H}_Y$ is lower semicomputable,
therefore (4.4) follows. Hence

$$\langle\phi|\mu_X|\phi\rangle\langle\psi|\mu_Y|\psi\rangle = \langle\phi|\langle\psi|(\mu_X \otimes \mu_Y)|\phi\rangle|\psi\rangle
 \stackrel{*}{<} \langle\phi|\langle\psi|\mu_{XY}|\phi\rangle|\psi\rangle,$$

which gives (4.5) for $H = \underline{H}$. For $H = \overline{H}$ note that by the monotonicity of logarithm, identity (4.3)
and (4.4) imply

$$(\log\mu_X) \otimes 1_Y + 1_X \otimes (\log\mu_Y) = \log(\mu_X \otimes \mu_Y) \stackrel{+}{<} \log\mu_{XY}.$$

Taking the expectation (multiplying by $\langle\phi|\langle\psi|$ on the left and $|\phi\rangle|\psi\rangle$ on the right) gives the
desired result. □

The analogous subadditivity property also holds for the quantum entropy $S(\rho)$.

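For classical complexity we have $K(x) \stackrel{+}{<} K(x,y)$, and the corresponding property also holds for
classical entropy. This monotonicity property can also be proved for quantum complexity.

Theorem 14 (Monotonicity). We have

$$\operatorname{Tr}_Y \mu_{XY} \stackrel{*}{=} \mu_X, \tag{4.6}$$

$$\kappa_{XY} \stackrel{+}{>} \kappa_X \otimes 1_Y. \tag{4.7}$$

For $|\phi\rangle, |\psi\rangle \in \mathcal{H}_N$, and $H = \overline{H}$ or $\underline{H}$ we have

$$H(|\phi\rangle) \stackrel{+}{<} H(|\phi\rangle|\psi\rangle). \tag{4.8}$$

Proof. Let $\rho_X = \operatorname{Tr}_Y \mu_{XY}$. Then $\rho_X$ is a semicomputable semi-density matrix over $\mathcal{H}_X$ and thus
$\rho_X \stackrel{*}{<} \mu_X$. At the same time, for any fixed vector $|\psi\rangle$, the matrix $\sigma_{XY} = \mu_X \otimes |\psi\rangle\langle\psi|$ is a lower
semicomputable semi-density matrix, hence $\mu_{XY} \stackrel{*}{>} \sigma_{XY}$. Taking the partial trace gives

$$\mu_X = \operatorname{Tr}_Y \sigma_{XY} \stackrel{*}{<} \operatorname{Tr}_Y \mu_{XY} = \rho_X.$$

This proves (4.6), which implies the inequality for $\underline{H}$.

Let $\{|\psi_i\rangle\}$ be any orthogonal basis of $\mathcal{H}_Y$ with $|\psi_1\rangle = |\psi\rangle$. Then we have

$$\langle\phi|\langle\psi|(\mu_X \otimes 1_Y)|\phi\rangle|\psi\rangle = \langle\phi|\mu_X|\phi\rangle
 \stackrel{*}{=} \langle\phi|\rho_X|\phi\rangle = \sum_i \langle\phi|\langle\psi_i|\mu_{XY}|\phi\rangle|\psi_i\rangle
 \ge \langle\phi|\langle\psi|\mu_{XY}|\phi\rangle|\psi\rangle,$$

which proves $\mu_X \otimes 1_Y \stackrel{*}{>} \mu_{XY}$. Taking logarithms and noting that $\log 1_Y = 0$, we get (4.7) which
proves the inequality for $\overline{H}$. □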

The quantum entropy analog of this monotonicity fails in a spectacular way. It is not true in
general that $S(\rho_X) \le S(\rho_{XY})$. Indeed, $\rho_{XY}$ could be the density matrix of a pure state, and then
$S(\rho_{XY}) = 0$. At the same time, if this pure state is an entangled state, a state that cannot be
represented in the form $|\phi\rangle|\psi\rangle$, only as a linear combination of such states, then $S(\rho_X) > 0$.
This paradox does not contradict the possibility that entropy is "average complexity". It just
reminds us that Theorem 14 says nothing about entangled states. An entangled state can be simple
even if it is a big sum, but in this case it will contain a lot of complex components.
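The failure of monotonicity for entropy can be seen on the smallest example, a Bell state of two
qubits. The sketch below is illustrative: it computes the reduced density matrix $\rho_X$ of a two-qubit
pure state by a partial trace and its von Neumann entropy in bits, using the closed-form eigenvalues
of a $2 \times 2$ Hermitian matrix. The function names are hypothetical.

```python
import math

def reduced_density_matrix(amplitudes):
    """Partial trace over the second qubit of a two-qubit pure state.

    amplitudes[2*x + y] is the amplitude of |x>|y>; the result is the 2x2 matrix rho_X.
    """
    return [[sum(amplitudes[2 * x1 + y] * amplitudes[2 * x2 + y].conjugate()
                 for y in range(2))
             for x2 in range(2)]
            for x1 in range(2)]

def entropy_bits(rho):
    """Von Neumann entropy (base 2) of a 2x2 Hermitian matrix with nonnegative eigenvalues."""
    a, d, b = rho[0][0].real, rho[1][1].real, rho[0][1]
    half_trace = (a + d) / 2
    disc = math.sqrt(max(half_trace ** 2 - (a * d - abs(b) ** 2), 0.0))
    eigenvalues = (half_trace + disc, half_trace - disc)
    return -sum(lam * math.log2(lam) for lam in eigenvalues if lam > 1e-15)

if __name__ == "__main__":
    s = 1 / math.sqrt(2)
    bell = [s, 0, 0, s]        # (|00> + |11>)/sqrt(2): pure, so S(rho_XY) = 0
    product = [s, s, 0, 0]     # |0>(|0> + |1>)/sqrt(2): a product state
    print(entropy_bits(reduced_density_matrix(bell)))
    print(entropy_bits(reduced_density_matrix(product)))
```

For the Bell state the joint state is pure, so its entropy is zero, while the reduced state is
maximally mixed and carries one full bit of entropy; for the product state both entropies vanish.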

5. The cloning problem
5.1. Maximal complexity of cloned states. For classical description complexity, the relation

$$K(x, x) \stackrel{+}{=} K(x)$$

holds and is to be expected: once we have $x$ we can copy it and get the pair $(x, x)$. But there is a
"no cloning theorem" in quantum mechanics saying that there is no physical way to get $|\psi\rangle|\psi\rangle$
from $|\psi\rangle$. It is interesting to see that a much stronger form of this theorem also holds, saying that
sometimes $H(|\psi\rangle|\psi\rangle)$ is much larger than $H(|\psi\rangle)$ (of course, at most twice as large). Moreover, we
can determine the maximum complexity of states of the form $|\psi\rangle^{\otimes m}$. Our results in this are very
similar in form to those of Berthiaume, van Dam and Laplante, and the proof method is also similar.

For $|\psi\rangle \in \mathcal{H}_N$, let $|\psi\rangle^{\otimes m}$ denote the $m$-fold tensor product of $|\psi\rangle$ with itself, an element of $\mathcal{H}_N^{\otimes m}$.



Let

$$\mathcal{S}_{N,m} \subset \mathcal{H}_N^{\otimes m}$$

be the subspace of elements of $\mathcal{H}_N^{\otimes m}$ invariant under the orthogonal transformations arising from
the permutations

$$|\phi_1\rangle \cdots |\phi_m\rangle \mapsto |\phi_{\pi(1)}\rangle \cdots |\phi_{\pi(m)}\rangle.$$

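Lemma 5.1 (standard facts about the symmetric subspace).

1. $\dim \mathcal{S}_{N,m} = \binom{m+N-1}{m}$.

2. $\mathcal{S}_{N,m}$ is invariant under unitary transformations of the form $U^{\otimes m}$.

3. If a density matrix over $\mathcal{S}_{N,m}$ commutes with all such transformations then it is a multiple of
unity.

Let

$$\overline{C}_{N,m} = \max_{|\psi\rangle \in \mathcal{H}_N} \overline{H}(|\psi\rangle^{\otimes m}), \tag{5.1}$$

and let $\underline{C}_{N,m}$ be defined the same way with $\underline{H}$ in place of $\overline{H}$.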
Theorem 15. We have

$$\overline{C}_{N,m} \stackrel{+}{<} K(m) + \log\binom{m+N-1}{m},$$

$$\underline{C}_{N,m} \ge \log\binom{m+N-1}{m}.$$

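Proof. The upper bound follows from the fact that $|\psi\rangle^{\otimes m} \in \mathcal{S}_{N,m}$ and from (3.3).

For simplicity, let us write for the moment $|\psi\rangle^m = |\psi\rangle^{\otimes m}$. For the lower bound, let us first set
$c = \underline{C}_{N,m}$. We have

$$\operatorname{Tr}\mu\,|\psi\rangle^m\langle\psi|^m = \langle\psi|^m \mu\, |\psi\rangle^m \ge 2^{-c} \tag{5.2}$$

for all states $|\psi\rangle \in \mathcal{H}_N$. Let $P_S$ be the projection to $\mathcal{S}_{N,m}$. Let $\Lambda$ be the uniform distribution on
the unit sphere in $\mathcal{H}_N$. Then

$$\rho = \int |\psi\rangle^m\langle\psi|^m \, d\Lambda$$

is a density matrix over $\mathcal{S}_{N,m}$. It commutes with all unitary transformations of the form $U^{\otimes m}$, and
therefore according to Lemma 5.1,

$$\rho = \binom{m+N-1}{m}^{-1} P_S.$$

Integrating (5.2) by $d\Lambda$ we get

$$2^{-c} \le \operatorname{Tr}\mu\rho = \binom{m+N-1}{m}^{-1}\operatorname{Tr}\mu P_S \le \binom{m+N-1}{m}^{-1}.$$

Taking negative logarithm, we get the lower bound on $\underline{C}_{N,m}$. □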
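To get a feeling for the size of this bound, the following sketch evaluates the dimension of the
symmetric subspace from Lemma 5.1 and its base-2 logarithm for $N = 2^n$. The function names are
hypothetical, and the additive constants hidden in the $\stackrel{+}{<}$ relations are ignored.

```python
import math

def symmetric_dim(N, m):
    """Dimension of the symmetric subspace S_{N,m}: binom(m + N - 1, m)."""
    return math.comb(m + N - 1, m)

def clone_lower_bound_bits(n_qubits, m):
    """log2 dim S_{N,m} for N = 2**n_qubits: the lower bound from the proof above, in bits."""
    return math.log2(symmetric_dim(2 ** n_qubits, m))

if __name__ == "__main__":
    for n in (5, 10, 20):
        # For m = 2 the bound is log2(N(N+1)/2), that is, close to 2n - 1.
        print(n, round(clone_lower_bound_bits(n, 2), 3))
```

For two copies of an $n$-qubit state the bound is $\log(N(N+1)/2)$, which is about $2n - 1$: the
complexity of the worst-case cloned pair is nearly twice the $n$ bits that bound the complexity of a
single copy.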

5.2. An algebraic consequence. This subsection says nothing new about quantum complexities, it 
only draws some technical inferences from the previous subsection. 

The problem of estimating $H(|\psi\rangle|\psi\rangle)$ can be reformulated into an algebraic problem for which
we are not aware of any previous solution. The results obtained above solve the problem: maybe
such a solution will also have some independent interest. For any $N \times N$ matrix $A$, let

$$u(A) = \frac{\|A^\dagger A\|}{\operatorname{Tr}A^\dagger A} = \frac{\max_j \alpha_j}{\sum_j \alpha_j},$$

where $\alpha_j$ are the eigenvalues of $A^\dagger A$. The function $u(A)$ measures the "unevenness" of the
distribution of eigenvalues of $A^\dagger A$. It can vary between $1/N$ for $A = 1$ and $1$ (when $A^\dagger A$ has rank
1). For a subspace $F$ of the vector space of symmetric (not necessarily self-adjoint!) matrices, let
$u(F) = \max_{A \in F} u(A)$. Let $N' = N(N+1)/2$. For $0 < d < N'$, we are interested in the quantity

$$u(d, N) = \min\{u(F) : \dim F \ge d\}.$$

Theorem 16. We have $u(d, N) \ge d/N'$.

Remark 5.2. This theorem has been strengthened from its preprint version. 

Before the proof, we give some lemmas setting up the connection with cloning. 
Lemma 5.3. Let $A$ be a symmetric $N \times N$ matrix $(a_{ij})$ and let

$$|a\rangle = \sum_{ij} a_{ij}\,|\beta_i\rangle|\beta_j\rangle.$$

Then

$$\sup_{|\phi\rangle \in \mathcal{H}_N} |\langle a|(|\phi\rangle|\phi\rangle)|^2 = u(A). \tag{5.3}$$

Proof. We can restrict ourselves to matrices $A$ with $\operatorname{Tr}A^\dagger A = \langle a|a\rangle = 1$. Then with $|\psi\rangle = |\phi\rangle|\phi\rangle$,
$|\phi\rangle = \sum_i x_i|\beta_i\rangle$,

$$|\langle a|\psi\rangle|^2 = \Bigl|\sum_{ij} a_{ij}x_ix_j\Bigr|^2 = |x^T A x|^2,$$

where $x^T$ is the transpose of $x$ (without conjugation).

By singular value decomposition, every matrix can be written in the form $VDU$ where
$D$ is a nonnegative diagonal matrix and $U$, $V$ are unitary transformations. If the elements of $D$
are all distinct, positive and in decreasing order then $U$, $V$ are unique. In this case, clearly if $A$ is
symmetric then $V = U^T$. This can be generalized to the case when the elements of $D$ are not all
positive and distinct, using for example limits. Thus, $A = U^T D U$. This gives $x^T A x = x^T U^T D U x =
(Ux)^T D (Ux)$. As $x$ runs through all possible vectors with $\sum_i |x_i|^2 = 1$, so does $Ux$. Let $d_1$ be the
largest element on the diagonal of $D$, then $d_1^2 = \|A^\dagger A\|$. We have

$$|(Ux)^T D\, Ux| = \Bigl|\sum_i d_i (Ux)_i^2\Bigr| \le \sum_i d_i |(Ux)_i|^2 \le d_1$$

since $\sum_i |(Ux)_i|^2 = 1$. The maximum of $|(Ux)^T D (Ux)|^2$ is achieved by the element
$x = U^{-1}(1, 0, \dots, 0)^T$, and then it is $d_1^2 = u(A)$. □
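Identity (5.3) can be sanity-checked numerically in the smallest nontrivial case. The sketch below is
an illustration under simplifying assumptions: it takes a real symmetric $2 \times 2$ matrix, computes
$u(A)$ from its closed-form eigenvalues, and compares it with a brute-force grid search of
$|x^T A x|^2 / \operatorname{Tr}A^T A$ over complex unit vectors $x$. All names are hypothetical.

```python
import cmath
import math

def unevenness_2x2(a, b, c):
    """u(A) for the real symmetric matrix A = [[a, b], [b, c]]."""
    mean, radius = (a + c) / 2, math.hypot((a - c) / 2, b)
    lam1, lam2 = mean + radius, mean - radius          # eigenvalues of A
    # For real symmetric A the eigenvalues of A^T A are lam1**2 and lam2**2.
    return max(lam1 ** 2, lam2 ** 2) / (lam1 ** 2 + lam2 ** 2)

def sup_overlap_2x2(a, b, c, steps=400, phases=40):
    """Grid estimate of sup |x^T A x|^2 / Tr(A^T A) over complex unit vectors x."""
    frob = a * a + 2 * b * b + c * c                   # Tr(A^T A) = <a|a>
    best = 0.0
    for i in range(steps):
        t = math.pi * i / steps
        for j in range(phases):
            x0 = math.cos(t)
            x1 = math.sin(t) * cmath.exp(2j * math.pi * j / phases)
            form = a * x0 * x0 + 2 * b * x0 * x1 + c * x1 * x1   # x^T A x, no conjugation
            best = max(best, abs(form) ** 2 / frob)
    return best

if __name__ == "__main__":
    print(unevenness_2x2(2.0, 1.0, -1.0), sup_overlap_2x2(2.0, 1.0, -1.0))
```

The grid search can only approach the supremum from below, so the two printed numbers should agree
up to the grid resolution.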

Lemma 5.4. For $0 < d < N'$, there is a computable semi-density matrix $\rho$ with

$$\sup_{|\psi\rangle = |\phi\rangle|\phi\rangle} -\log\langle\psi|\rho|\psi\rangle \le \log(N' - d) - \log(1 - u(d, N)).$$

Proof. Using the notation of Lemma 5.3, let $F$ be the subspace of dimension $d$ of vectors $|a\rangle$ on which
the minimum $u(d, N)$ is achieved. With $P = 1 - F$, let $\rho$ be the semi-density matrix defined as in the
proof of the corresponding theorem of Section 3. Similarly to (3.2) we have, for any $|\psi\rangle = |\phi\rangle|\phi\rangle$:

$$-\log\langle\psi|\rho|\psi\rangle \le \log(N' - d) - \log(1 - \langle\psi|F|\psi\rangle).$$

Note that $\langle\psi|F|\psi\rangle = |\langle a|\psi\rangle|^2$ for some $|a\rangle \in F$, hence by (5.3) we have $\langle\psi|F|\psi\rangle \le u$, hence the last
term of the right-hand side is $\le -\log(1 - u)$. □

Proof of Theorem 16. The reasoning of Theorem 15 implies that $\log N'$ lower-bounds the left-hand
side in the above lemma. Thus,

$$\log N' \le \log(1 - d/N') + \log N' - \log(1 - u),$$
$$u \ge d/N'.$$

□

6. Randomness tests

6.1. Universal tests. In classical algorithmic information theory, description
complexity helps clarify what experimental outcomes should be called random with respect to a
hypothetical probability distribution. If the set of possible outcomes is a discrete one, say the set of
natural numbers, then, given a probability distribution $\nu$, we call a lower semicomputable function
$f(x)$ a randomness test if $\sum_x f(x)\nu(x) \le 1$. It is known that there is a universal test $t_\nu(x)$, a
test that dominates all other tests to within a multiplicative constant. An outcome is considered
non-random with respect to $\nu$ when $t_\nu(x)$ is large. In case of a computable distribution $\nu$, we have

$$t_\nu(x) \stackrel{*}{=} \frac{m(x)}{\nu(x)}, \tag{6.1}$$

where the multiplicative constant in the $\stackrel{*}{=}$ depends on $\nu$. (The general case is more complicated.)
The deficiency of randomness is defined as $d_\nu(x) = \log t_\nu(x)$. In case of a computable distribution
$\nu$ it is known to be

$$d_\nu(x) \stackrel{+}{=} -\log\nu(x) + \log m(x) \stackrel{+}{=} -\log\nu(x) - K(x). \tag{6.2}$$

Thus, for a computable distribution, the universal test measures the difference between the logarithm
of the probability and the complexity.

In the quantum setting, what corresponds to a probability distribution is a computable density
matrix $\rho$. What corresponds to a function is a self-adjoint operator. So, let us say that a randomness
test is a lower semicomputable self-adjoint operator $F_\rho$ with

$$\operatorname{Tr}F_\rho\,\rho \le 1.$$

Remark 6.1. In the theorem below, the expression

$$\rho^{-1/2}\mu\rho^{-1/2}$$

appears, which does not make sense if $\rho$ is not invertible. However, let us write $\sigma = \mu^{1/2}\rho^{-1/2}$; this
expression makes sense on the subspace $V$ orthogonal to the kernel of $\rho$, and therefore $T'' = \sigma^\dagger\sigma$ also
makes sense there. Therefore we define $\langle\psi|T''|\psi\rangle$ as $\infty$ for any $|\psi\rangle \notin V$, and there is no problem for
$|\psi\rangle \in V$.

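Theorem 17 (Universal test). There is a test $T_\rho$ which is universal in the sense that it dominates each
other test $R$: we have $R \stackrel{*}{<} T_\rho$, where the multiplicative constant in $\stackrel{*}{<}$ may depend on $R$ and $\rho$. We
have $T_\rho \stackrel{*}{=} T'_\rho \stackrel{*}{=} T''_\rho$ where

$$T'_\rho = \sum_{|\phi\rangle} \frac{m(|\phi\rangle)}{\langle\phi|\rho|\phi\rangle}\,|\phi\rangle\langle\phi|, \qquad T''_\rho = \rho^{-1/2}\mu\rho^{-1/2}.$$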



Proof. The proof of the existence of a universal test is similar to the proof of Proposition 2.1. The
proof of $T \stackrel{*}{=} T'$ is similar to the one showing $\mu \stackrel{*}{=} \mu'$ in the theorem relating $\mu$ and $\mu'$.

Let us prove $T \stackrel{*}{=} T''$. To see that $T''$ is lower semicomputable, note that, as direct computation
shows, for any operator $C$ the function $A \mapsto C^\dagger A C$ is monotonic on the set of self-adjoint operators
$A$ with respect to the relation $\le$. By the cyclic property of the trace, we also have $\operatorname{Tr}T''\rho = \operatorname{Tr}\mu \le 1$.
This proves $T'' \stackrel{*}{<} T$, it remains to prove that $T \stackrel{*}{<} T''$. This is equivalent to

$$\rho^{1/2}\,T\,\rho^{1/2} \stackrel{*}{<} \rho^{1/2}\,T''\,\rho^{1/2} = \mu.$$



But the left-hand side is a lower semicomputable nonnegative definite matrix whose trace is $\le 1$,
again due to the cyclic property of trace. Therefore by the defining property of $\mu$, it is $\stackrel{*}{<} \mu$. □

The expression for $T''_\rho$ is similar to (6.1), but it does not separate the roles of the density matrix $\rho$
and of the universal probability $\mu$ as neatly, certainly not in the typical cases when $\mu$ and $\rho$ do not
commute. Assume that the eigenvalues of $\rho$ are $p_1 \ge p_2 \ge \cdots$, with the corresponding eigenvectors
$|v_i\rangle$ (these exist since our space is finite-dimensional). Let $(m_{ij})$ be the matrix of the operator $\mu$
when expressed in this basis. For a certain state $|\psi\rangle = \sum_i c_i|v_i\rangle$, we can express the value of the
test on $|\psi\rangle$ as follows. If there is any $i$ with $p_i = 0$ and $c_i \ne 0$ then according to Remark 6.1, the
value is $\infty$. Otherwise, it is

$$\langle\psi|T''_\rho|\psi\rangle = \sum_{ij} m_{ij}\,(p_ip_j)^{-1/2}\,c_i^*c_j. \tag{6.3}$$

The term $(p_ip_j)^{-1/2}c_i^*c_j$ is defined to be 0 if $c_i^*c_j = 0$, and we excluded the case when $p_ip_j = 0$ but
$c_i^*c_j \ne 0$. The roles of $\mu$ and $\rho$ do not seem to be separable in the same way as in the classical case.
However, if $\rho$ is the uniform distribution, so that $p_i = 1/N$ for all $i$, then the expression simplifies to

$$N\sum_{i,j=1}^{N} m_{ij}\,c_i^*c_j = N\langle\psi|\mu|\psi\rangle,$$

which is the classical comparison of the probability to the universal probability.

6.2. Relation to Martin-Löf tests. The sum for $T'$ in Theorem 17 is similar to $\mu'$ in the earlier
theorem on $\mu'$. In the classical case and with a computable $\rho$, just like there, it can be replaced
with a supremum. In the quantum case it cannot: indeed, the expression of $\mu'$ is a special case of $T'$,
and we have shown in Section 3 that the sum in $\mu'$ cannot be replaced with supremum. We do not know
whether there is still an approximate relation like in that theorem: the proof does not carry over.

It is worth generalizing the sum for $T'_\rho$ as

$$\sum_F \frac{m(F)}{\operatorname{Tr}F\rho}\,F,$$

where $F$ runs through all elementary nonnegative self-adjoint operators. An interesting kind of
self-adjoint operator is a projection $P$ to some subspace. Such a term looks like

$$\frac{m(P)}{\operatorname{Tr}P\rho}\,P.$$

This term is analogous to a Martin-Löf test. An outcome $x$ would be caught by a Martin-Löf test
in the discrete classical case if it falls into some simple set $S$ with small probability. The fact that
$S$ is simple means that $K(S)$ is small, in other words $m(S)$ is large. Altogether, we can say that $x$
is caught if the expression

$$\frac{m(S)}{\rho(S)}\,1_S(x)$$

is large, where $1_S(x)$ is the indicator function of the set $S$. In the quantum case, for state $|\psi\rangle$, what
corresponds to this is the expression

$$\frac{m(P)}{\operatorname{Tr}P\rho}\,\langle\psi|P|\psi\rangle.$$

The probability of $S$ translates to $\operatorname{Tr}P\rho$, and $1_S(x)$ translates to $\langle\psi|P|\psi\rangle$. Thus, a quantum
Martin-Löf test catches a state $|\psi\rangle$ if it is "not sufficiently orthogonal" to some simple low-probability
subspace. Compare this with the earlier characterization of simple states as those close to a simple
low-dimensional subspace.

As we see, the universal quantum randomness test contains the natural generalizations of the
classical randomness tests, but on account of the possible non-commutativity between $\rho$ and $\mu$,
it may also test in some new ways that do not correspond to anything classical. It would be
interesting to find what these ways are.

7. Proof of Theorem 1

Let us denote

$$K_m(|\psi\rangle) = \min\{\, l(p) : U(p) = |\phi\rangle,\ -\log|\langle\phi|\psi\rangle|^2 \le m \,\}.$$

The first lemma lowerbounds $K_\infty(|\psi\rangle)$, the later ones lowerbound $K_m(|\psi\rangle)$ for finite $m$.

Lemma 7.1. For each $k$ there is a subspace $V$ of $Q_n$, of dimension $2^n - 2^k$, with the property that for
all $|\psi\rangle \in V$ we have $K_\infty(|\psi\rangle) \ge k$.

Proof. Let $p_1, \dots, p_r$ be all programs of length $< k$ for which $U(p_i) \in Q_n$. Then $r < 2^k$. Let $V$ be
the set of elements of $Q_n$ orthogonal to all vectors of the form $U(p_i)$. □

Let $b_n$ denote the volume of the unit ball in an $n$-dimensional Euclidean space. Then for the
surface volume $s_n$ of this ball we have

$$b_{n-1} < s_n = n b_n. \tag{7.1}$$

For an angle $\alpha$, let $s_n(\alpha)$ be the surface volume of a subset of the surface cut out by a cone of
half-angle $\alpha$: for some vector $|u\rangle$, this is the set of all vectors $|x\rangle$ of unit length with $\langle u|x\rangle \ge \cos\alpha$.
Thus, we have $s_n = s_n(\pi)$. We are interested in how fast $s_n(\alpha)$ decreases from $s_n/2$ to 0 as $\alpha$ moves
from $\pi/2$ to 0.

Lemma 7.2. Let $\alpha = \pi/2 - y$. Then

$$s_n(\alpha)/s_n < \exp(-ny^2/2 + \ln n). \tag{7.2}$$

Proof. We have, for $k \ge 2$:

$$s_k(\alpha) = s_{k-1}\int_0^{\alpha} \sin^{k-2}x\,dx \le s_{k-1}\,\alpha\,\sin^{k-2}\alpha. \tag{7.3}$$

So, we need to estimate $\int_0^{\alpha}\sin^n x\,dx$. The method used (also called "Laplace's" method) works for
any twice differentiable function with a single maximum. Let $g(x) = \ln\sin x$, then it can be checked
that $g'(\pi/2) = 0$, $g''(\pi/2) = -1$, $g'''(x) > 0$ for $x < \pi/2$. The Taylor expansion around $\pi/2$ gives,
for $y > 0$:

$$g(\pi/2 - y) = -y^2/2 - y^3 g'''(\pi/2 - z)/6 < -y^2/2,$$

where $0 < z < y$. Hence, since $\sin x$ is increasing, we have for $x \le \pi/2 - y$,

$$\sin^n(x) < e^{-ny^2/2}.$$

On the other hand, by (7.1), $b_{k-1} = s_{k-1}/(k-1)$, showing $s_{n-1} < (n-1)s_n$. Hence (7.2) follows
from (7.3).

□
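Bound (7.2) can be compared with the exact cap fraction for small dimensions. The sketch below is
illustrative: it evaluates $s_n(\alpha)/s_n$ from the integral representation in (7.3) by a simple midpoint
rule and checks it against the right-hand side of (7.2). The function names and the number of
integration steps are arbitrary choices, not part of the paper.

```python
import math

def integrate(f, lo, hi, steps=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def cap_fraction(n, alpha):
    """s_n(alpha) / s_n, using s_n(alpha) = s_{n-1} * int_0^alpha sin^{n-2} x dx as in (7.3)."""
    def weight(x):
        return math.sin(x) ** (n - 2)
    return integrate(weight, 0.0, alpha) / integrate(weight, 0.0, math.pi)

def bound_7_2(n, y):
    """Right-hand side of (7.2): exp(-n y^2 / 2 + ln n)."""
    return math.exp(-n * y * y / 2 + math.log(n))

if __name__ == "__main__":
    for n, y in [(10, 0.5), (50, 0.5), (100, 0.3), (200, 0.4)]:
        print(n, y, cap_fraction(n, math.pi / 2 - y), bound_7_2(n, y))
```

In these examples the exact fraction is far below the bound; the factor $n$ coming from the $\ln n$ term
makes (7.2) loose, which is harmless for the way it is used later.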



Lemma 7.3. In any Hilbert space $\mathcal{H}$ of dimension $2^n$ (it may be a subspace of some $Q_r$), the volume
fraction of the set of unit vectors $|\psi\rangle$ in $\mathcal{H}$ with the property that $K_m(|\psi\rangle) < k$ is

$$< \exp(-2^{n-m} + k\ln 2 + n).$$

Proof. We view $Q_n$ as a $2^{n+1}$-dimensional Euclidean space. Assume $-\log|\langle\phi|\psi\rangle|^2 \le m$. If $\alpha$ is the
angle between $|\phi\rangle$ and $|\psi\rangle$ then this means

$$2^{-m/2} \le |\langle\phi|\psi\rangle| = \cos\alpha = \sin(\pi/2 - \alpha) < \pi/2 - \alpha,$$

giving $\alpha < \pi/2 - 2^{-m/2}$. For a fixed $|\phi\rangle$, the relative volume (with respect to $s_{2^{n+1}}$) of the set of
vectors $|\psi\rangle$ with $-\log|\langle\phi|\psi\rangle|^2 \le m$ is therefore by (7.2)

$$< \exp(-2^{n+1}2^{-m}/2 + n) = \exp(-2^{n-m} + n).$$

Let $p_1, \dots, p_r$ be all programs of length $< k$ for which $U(p_i) \in Q_n$. Then $r < 2^k$. The volume of
all vectors $|\psi\rangle$ that are close in the above sense to at least one of the vectors $U(p_i)$ is thus

$$< 2^k\exp(-2^{n-m} + n) = \exp(-2^{n-m} + k\ln 2 + n).$$

□



Proof of Theorem 1. According to Lemma 7.1, there is a subspace $V$ of $Q_n$, of dimension
$2^n - 2^{n-1} = 2^{n-1}$, with the property that for all $|\psi\rangle \in V$, for all $m$ we have $K_m(|\psi\rangle) \ge n - 1$.
Let $m = n - 2\log n$.

We can apply Lemma 7.3 to this subspace $V$ of dimension $2^{n-1}$, and obtain that for a certain
constant $c$, the volume fraction of vectors with $K_m(|\psi\rangle) < 2n$ is

$$< \exp\bigl(-2^{(n-1)-(n-2\log n)} + 2n\ln 2 + (n-1) + c\bigr)
 = \exp\bigl(-n^2/2 + n(2\ln 2 + 1) + c - 1\bigr).$$

If $n$ is large this is smaller than 1, so there are states with $K_\infty(|\psi\rangle) \ge n - 1$ and
$K_{n-2\log n}(|\psi\rangle) \ge 2n$. For these, clearly

$$K_Q(|\psi\rangle) \ge (n-1) + (n - 2\log n + 1) = 2n - 2\log n.$$

□
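The simplification of the exponent in this proof is plain arithmetic and can be checked
mechanically. The sketch below evaluates both forms of the exponent and reports where it first turns
negative. The constant $c$ is not specified in the text, so the default `c=0.0` is a hypothetical
placeholder chosen only for illustration.

```python
import math

def exponent(n, c=0.0):
    """-2^{(n-1)-m} + 2n ln 2 + (n-1) + c with m = n - 2 log2(n), as in the proof."""
    m = n - 2 * math.log2(n)
    return -(2 ** ((n - 1) - m)) + 2 * n * math.log(2) + (n - 1) + c

def simplified_exponent(n, c=0.0):
    """The simplified form -n^2/2 + n(2 ln 2 + 1) + c - 1."""
    return -n * n / 2 + n * (2 * math.log(2) + 1) + c - 1

if __name__ == "__main__":
    for n in range(2, 10):
        print(n, round(exponent(n), 4), round(simplified_exponent(n), 4))
```

With the placeholder $c = 0$ the exponent is already negative from $n = 5$ on; a larger constant only
shifts this threshold, since the $-n^2/2$ term eventually dominates.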

8. Conclusions 

We advanced a new proposal to extend the theory of descriptional complexity to the quantum 
setting. The approach starting from the universal density matrix appears to be fruitful and leads to 
some attractive relations. However, the theory is still very incomplete. The following tasks seem to 
be the most urgent. 

1. Strengthen Theorem 10 in a way that the smallness of $H(|\psi\rangle)$ allows a direct inference on
the smallness of $QC(|\psi\rangle)$ (or find a counterexample). For this, it seems to us that the behavior
of a monotonically increasing sequence of density functions needs to be understood better:
namely, whether some approximate monotonicity can be stated about the subspaces $E_k$. Even
if such a monotonicity is found, and even if Theorem 10 can be proved for $\mu$ instead of just
computable density matrices, the result is too weak. To strengthen it, probably the theory
of indeterminate-length quantum codes (the quantum analog of variable-length codes) will be
needed, as developed in the literature.

2. Find the proper generalization to the quantum setting of the classical theorem saying that 
information cannot increase under the effect of any probabilistic computable transformation. 

3. What kind of addition theorems can be expected for quantum description complexity? The
question is unsolved even for the von Neumann entropy. Also, the translation between the
results on quantum description complexity and those on the von Neumann entropy will not
be straightforward. As we remarked, the relation $H(|\phi\rangle|\psi\rangle) \stackrel{+}{>} H(|\phi\rangle)$ holds while
$S(\rho_X) \le S(\rho_{XY})$ does not. Still, maybe the study of the problem for quantum description complexity
helps with the understanding of the problem for von Neumann entropy, and its relation to
coding tasks of quantum information theory.

Despite all the caveats, let us ask the question (risking that somebody finds a trivial answer):
does $H$ obey strong superadditivity?